AITopics | online policy selection

Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations

Neural Information Processing Systems | Feb-16-2026, 09:29:35 GMT

We study the problem of online adaptive policy selection for nonlinear time-varying discrete-time dynamical systems.

artificial intelligence, machine learning, online policy selection, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > United States > California > Los Angeles County > Pasadena (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)

Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations

Neural Information Processing Systems | Dec-26-2025, 12:22:38 GMT

We study online adaptive policy selection in systems with time-varying costs and dynamics. We develop the Gradient-based Adaptive Policy Selection (GAPS) algorithm together with a general analytical framework for online policy selection via online optimization. Under our proposed notion of contractive policy classes, we show that GAPS approximates the behavior of an ideal online gradient descent algorithm on the policy parameters while requiring less information and computation. When convexity holds, our algorithm is the first to achieve optimal policy regret. When convexity does not hold, we provide the first local regret bound for online policy selection. Our numerical experiments show that GAPS can adapt to changing environments more quickly than existing benchmarks.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.80)

51200d29d1fc15f5a71c1dab4bb54f7c-Supplemental.pdf

Neural Information Processing Systems | Nov-14-2025, 00:42:36 GMT

artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

Neural Information Processing Systems | Nov-14-2025, 00:42:28 GMT

Offline methods for reinforcement learning have the potential to help bridge the gap between reinforcement learning research and real-world applications.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > Canada (0.04)

Industry: Leisure & Entertainment > Games (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

51200d29d1fc15f5a71c1dab4bb54f7c-AuthorFeedback.pdf

Neural Information Processing Systems | Nov-14-2025, 00:42:17 GMT

We would like to thank our reviewers for their thoughtful comments and feedback. However, to preserve anonymity, we cannot share the link to the repository. Our most challenging tasks are locomotion tasks, which are not well suited for human demonstrations. But we believe this is an important direction for research as well. We will add this rationale to the paper.

artificial intelligence, dataset, policy selection, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.30)

a7a7180fe7f82ff98eee0827c5e9c141-Paper-Conference.pdf

Neural Information Processing Systems | Oct-9-2025, 03:59:21 GMT

artificial intelligence, machine learning, online policy selection, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
North America > United States > California > Los Angeles County > Pasadena (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.68)

Supplementary material A Detailed description of baselines A.1 Continuous Baselines

Neural Information Processing Systems | Oct-2-2025, 22:16:24 GMT

[...] Multivariate Gaussian distributions, which is used as the final policy output. [...] Humanoid experiments, the data consists of very diverse ways of running). In Table 5, we show the hyperparameters shared among our baselines. Distributed Distributional Deep Deterministic Policy Gradient [Barth-Maron et al., 2018]: We used batch size 1024 for the experiments. Behavior Regularized Actor Critic [Wu et al., 2019] is an actor-critic algorithm where the [...] We use the exact same network architecture as described in the original paper.

artificial intelligence, hyperparameter, machine learning, (16 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

51200d29d1fc15f5a71c1dab4bb54f7c-Paper.pdf

Neural Information Processing Systems | Oct-2-2025, 22:16:18 GMT

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Leisure & Entertainment > Games (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

51200d29d1fc15f5a71c1dab4bb54f7c-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-2-2025, 22:16:06 GMT

artificial intelligence, dataset, policy selection, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.30)

Online Adaptive Policy Selection in Time-Varying Systems: No-Regret via Contractive Perturbations

Neural Information Processing Systems | Jan-19-2025, 18:13:46 GMT

We study online adaptive policy selection in systems with time-varying costs and dynamics. We develop the Gradient-based Adaptive Policy Selection (GAPS) algorithm together with a general analytical framework for online policy selection via online optimization. Under our proposed notion of contractive policy classes, we show that GAPS approximates the behavior of an ideal online gradient descent algorithm on the policy parameters while requiring less information and computation. When convexity holds, our algorithm is the first to achieve optimal policy regret. When convexity does not hold, we provide the first local regret bound for online policy selection.

adaptive policy selection, artificial intelligence, machine learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis
